Abstract: A variable-length code is a fix-free code if no 
codeword is a prefix or a suffix of any other codeword. In a 
fix-free code any finite sequence of codewords can be decoded 
in both directions, which can improve the robustness to channel 
noise and speed up the decoding process. In this paper we prove 
a new sufficient condition for the existence of fix-free codes and 
improve the upper bound on the redundancy of optimal fix-free 
codes. 

Index Terms: Fix-free code, redundancy. 



I. Introduction 

Let $p = \{p_1, \ldots, p_m\}$ be the probability distribution of a source, and let $C$ be a code for the source. The redundancy $R$ of a code $C$ is defined as the difference between the average codeword length $L(C)$ of this code and the entropy $H(p)$ of the source. We denote the redundancy of an optimal fix-free code by $R_f$.

Ahlswede et al. [1] have proved that $0 \le R_f < 2$. They have also shown that the lower bound on $R_f$ cannot be improved. Later Ye and Yeung [6], [7] derived several upper bounds on $R_f$ in terms of partial information about the source distribution. The goal of this paper is to improve the upper bound on $R_f$ from 2 to $4 - \log_2 5$, which is approximately 1.678.

Let $v_n = (k_1, \ldots, k_n)$ be a vector, where the $k_i$ are nonnegative integers. By $C(v_n)$ denote a binary variable-length code containing $k_i$ codewords of length $i$, for each $i = 1, \ldots, n$. The Kraft sum of the vector $v_n$ is the quantity

$$S(v_n) = \sum_{i=1}^{n} \frac{k_i}{2^i}. \tag{1}$$
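The Kraft sum is easy to evaluate exactly with rational arithmetic. The following Python sketch is only an illustration; the function name `kraft_sum` is a hypothetical helper, not notation from the paper, and the test vector is the example that appears later in the proof of Theorem 1.

```python
from fractions import Fraction

def kraft_sum(v):
    """Kraft sum of v = (k_1, ..., k_n): the sum of k_i / 2**i, with i starting at 1."""
    return sum(Fraction(k, 2 ** i) for i, k in enumerate(v, start=1))

# Example vector taken from the proof of case 3 of Theorem 1.
example = (0, 0, 2, 1, 2, 6, 20)
print(kraft_sum(example))  # 5/8
```

Exact fractions are used instead of floats so that comparisons with thresholds such as 5/8 or 3/4 are not affected by rounding.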



Ahlswede et al. [1] conjectured that $S(v_n) \le 3/4$ is a sufficient condition for the existence of a binary fix-free code $C(v_n)$. They proved that the conjecture is true in the weaker case when the Kraft sum is at most $1/2$. If the conjecture is true, the upper bound on $R_f$ can be improved to $3 - \log_2 3$, which is approximately 1.415 [7]. Since the conjecture was made, several special cases of it have been proven [4], [5], [7], [8], although the general conjecture still remains an open problem.

In this paper we prove a new special case of the conjecture. We show that $S(v_n) \le 5/8$ implies the existence of a fix-free code $C(v_n)$ (Theorem 1). This result yields an improved upper bound for the redundancy of optimal fix-free codes (Theorem 2).

II. New Sufficient Condition for the Existence of Fix-Free Codes

Let $w$ be an arbitrary binary vector of length $n$. The binary vector composed of the first {last} $p$ symbols of $w$ is called the $p$-prefix {$p$-suffix} of $w$ and denoted by ${}^{(p)}w$ {$w^{(p)}$}. We say that a vector $w$ has the form $\alpha \star \beta$, where $\alpha, \beta \in \{0, 1\}$, if ${}^{(1)}w = \alpha$ and $w^{(1)} = \beta$.

Consider a binary variable-length fix-free code $C(v_n)$, where $v_n = (k_1, \ldots, k_n)$.

A vector $w \in \{0, 1\}^n$ is called prefix-free {suffix-free} over the code $C(v_n)$ if $C(v_n)$ does not contain any prefix {suffix} of $w$; here $w$ itself counts as one of its own prefixes {suffixes}.
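The two definitions used so far (a fix-free code, and a word that is prefix-free or suffix-free over a code) can be stated as short predicates. This is a minimal Python sketch with codewords modelled as strings over "0" and "1"; the function names are hypothetical and chosen only for this illustration.

```python
def is_fix_free(code):
    """True if no codeword is a prefix or a suffix of another codeword."""
    words = list(code)
    for a in words:
        for b in words:
            if a != b and (b.startswith(a) or b.endswith(a)):
                return False
    return True

def is_prefix_free_over(w, code):
    """True if the code contains no prefix of w (w itself included)."""
    return not any(w.startswith(c) for c in code)

def is_suffix_free_over(w, code):
    """True if the code contains no suffix of w (w itself included)."""
    return not any(w.endswith(c) for c in code)

print(is_fix_free({"00", "11"}))                  # True
print(is_fix_free({"0", "01"}))                   # False, "0" is a prefix of "01"
print(is_prefix_free_over("010", {"00", "11"}))   # True
print(is_suffix_free_over("011", {"00", "11"}))   # False, "11" is a suffix
```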

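By definition, put

$$
\begin{aligned}
{}^{0}P(C) &= \{\, w \mid w \text{ is prefix-free over } C \text{ and } {}^{(1)}w = 0 \,\}, \\
{}^{1}P(C) &= \{\, w \mid w \text{ is prefix-free over } C \text{ and } {}^{(1)}w = 1 \,\}, \\
P(C) &= {}^{0}P(C) \cup {}^{1}P(C), \\
F^{0}(C) &= \{\, w \mid w \text{ is suffix-free over } C \text{ and } w^{(1)} = 0 \,\}, \\
F^{1}(C) &= \{\, w \mid w \text{ is suffix-free over } C \text{ and } w^{(1)} = 1 \,\}, \\
F(C) &= F^{0}(C) \cup F^{1}(C).
\end{aligned}
$$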

Let $M$ be an arbitrary subset of $\{0, 1\}^n$. The set $M$ is called right regular if all $(n-1)$-suffixes of words from $M$ are pairwise distinct, i.e., for all $c_1, c_2 \in M$, $c_1 \ne c_2$ implies $c_1^{(n-1)} \ne c_2^{(n-1)}$.

Similarly, the set $M$ is called left regular if all $(n-1)$-prefixes of words from $M$ are pairwise distinct, i.e., for all $c_1, c_2 \in M$, $c_1 \ne c_2$ implies ${}^{(n-1)}c_1 \ne {}^{(n-1)}c_2$.



Clearly, ${}^{0}P(C)$ and ${}^{1}P(C)$ are right regular sets, because all words in each of them share the same first symbol. Likewise, $F^{0}(C)$ and $F^{1}(C)$ are left regular sets.

Let $M_1$ and $M_2$ be arbitrary subsets of $\{0, 1\}^n$. By definition, put

$$M_1 \otimes M_2 = \{\, w \in \{0, 1\}^{n+1} \mid {}^{(n)}w \in M_1 \text{ and } w^{(n)} \in M_2 \,\}.$$



The following lemma is obvious. 

Lemma 1: Suppose $C(v_n)$ is an arbitrary fix-free code; then $P(C) \otimes F(C)$ is the set of all words of length $n+1$ that can be added to $C(v_n)$ without violation of the fix-free property of the code. Moreover, ${}^{\alpha}P(C) \otimes F^{\beta}(C)$ is the set of all words of the form $\alpha \star \beta$ and length $n+1$ that can be added to $C(v_n)$ without violation of the fix-free property of the code.
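Lemma 1 can be checked by brute force on a small code. The Python sketch below is an illustration under stated assumptions: words are strings over "0" and "1", and the helper names (`all_words`, `prefix_free_set`, `suffix_free_set`, `glue`, `addable`) are hypothetical, with `glue` standing for the operation $M_1 \otimes M_2$.

```python
from itertools import product

def all_words(n):
    """All binary strings of length n."""
    return ["".join(bits) for bits in product("01", repeat=n)]

def prefix_free_set(code, n):
    """P(C): words of length n that have no codeword as a prefix."""
    return {w for w in all_words(n) if not any(w.startswith(c) for c in code)}

def suffix_free_set(code, n):
    """F(C): words of length n that have no codeword as a suffix."""
    return {w for w in all_words(n) if not any(w.endswith(c) for c in code)}

def glue(m1, m2, n):
    """M1 (x) M2: words of length n+1 with n-prefix in M1 and n-suffix in M2."""
    return {w for w in all_words(n + 1) if w[:n] in m1 and w[1:] in m2}

def addable(code, length):
    """Direct check: words of the given length that keep the code fix-free."""
    return {w for w in all_words(length)
            if not any(w.startswith(c) or w.endswith(c) for c in code)}

code = {"00", "11"}  # a code C(v_2) with v_2 = (0, 2)
n = 2
lhs = glue(prefix_free_set(code, n), suffix_free_set(code, n), n)
print(sorted(lhs))              # ['010', '101']
print(lhs == addable(code, 3))  # True
```

Both computations return the same two words, which are exactly the words of length 3 that neither start nor end with 00 or 11.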

Lemma 2: Suppose $M_1$ is a right regular subset of $\{0, 1\}^n$ and $M_2$ is a left regular subset of $\{0, 1\}^n$; then

$$|M_1 \otimes M_2| \ge |M_1| + |M_2| - 2^{n-1}. \tag{2}$$

Proof: By $M_1^{(n-1)}$ denote the set of $(n-1)$-suffixes of words from $M_1$. In the same way, by ${}^{(n-1)}M_2$ denote the set of $(n-1)$-prefixes of words from $M_2$. Since $M_1$ is right regular, it follows that $|M_1^{(n-1)}| = |M_1|$. Similarly, $|{}^{(n-1)}M_2| = |M_2|$. Since $M_1^{(n-1)}$ and ${}^{(n-1)}M_2$ are subsets of $\{0, 1\}^{n-1}$, it follows that $|M_1^{(n-1)} \cup {}^{(n-1)}M_2| \le 2^{n-1}$. Therefore, $|M_1^{(n-1)} \cap {}^{(n-1)}M_2| \ge |M_1| + |M_2| - 2^{n-1}$. Let $b$ denote an arbitrary element of $M_1^{(n-1)} \cap {}^{(n-1)}M_2$. It now follows that there exist $a, c \in \{0, 1\}$ such that $ab \in M_1$ and $bc \in M_2$. Hence, $abc \in M_1 \otimes M_2$, and distinct elements $b$ give distinct words $abc$. Thus, $|M_1 \otimes M_2| \ge |M_1| + |M_2| - 2^{n-1}$. This completes the proof.
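The bound of Lemma 2 can also be tested numerically. The following Python sketch draws random right regular and left regular sets and checks inequality (2) for each pair. It is a sanity check under assumptions, not part of the proof: the generators and the name `lemma2_holds` are hypothetical, and `glue` is an overlap-based implementation of $M_1 \otimes M_2$.

```python
import random
from itertools import product

def glue(m1, m2):
    """M1 (x) M2: overlap the (n-1)-suffix of a word from M1
    with the (n-1)-prefix of a word from M2."""
    return {a + c[-1] for a in m1 for c in m2 if a[1:] == c[:-1]}

def random_right_regular(n, rng):
    """For every b of length n-1 choose at most one of the words 0b, 1b."""
    chosen = set()
    for bits in product("01", repeat=n - 1):
        first = rng.choice(["0", "1", None])
        if first is not None:
            chosen.add(first + "".join(bits))
    return chosen

def random_left_regular(n, rng):
    """For every b of length n-1 choose at most one of the words b0, b1."""
    chosen = set()
    for bits in product("01", repeat=n - 1):
        last = rng.choice(["0", "1", None])
        if last is not None:
            chosen.add("".join(bits) + last)
    return chosen

def lemma2_holds(n, trials, seed):
    """Check |M1 (x) M2| >= |M1| + |M2| - 2**(n-1) on random regular sets."""
    rng = random.Random(seed)
    for _ in range(trials):
        m1 = random_right_regular(n, rng)
        m2 = random_left_regular(n, rng)
        if len(glue(m1, m2)) < len(m1) + len(m2) - 2 ** (n - 1):
            return False
    return True

print(lemma2_holds(n=4, trials=200, seed=0))  # True
```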

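Theorem 1: If $S(v_n) \le 5/8$, then there exists a fix-free code $C(v_n)$.

Proof: Clearly, it suffices to prove that $S(v_n) = 5/8$ implies the existence of a fix-free code $C(v_n)$: if the Kraft sum is smaller, increasing $k_n$ (after appending zero entries so that $n \ge 3$, if necessary) raises it to exactly $5/8$, and any subset of a fix-free code is again fix-free. Let us consider three cases.

1) $k_1 = 1$

2) $k_1 = 0$, $k_2 = 2$

3) $k_1 = 0$, $k_2 \le 1$

In every case we construct the code $C(v_n)$ in $n$ steps. On step $t$ we add $k_t$ words of length $t$ to the code. The input of step $t$ is a code $C(v_{t-1})$, the output is a code $C(v_t)$. Thus, on step $n$ we construct $C(v_n)$.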

Proof of case 1: We shall now prove that $S(v_n) \le 3/4$ and $k_1 = 1$ imply the existence of a fix-free code $C(v_n)$. This claim is stronger than the assertion of the theorem. Put $C(v_1) = \{0\}$. Suppose that a fix-free code $C = C(v_{t-1})$ is constructed; we shall prove that on step $t$ we can add $k_t$ words of length $t$ to the code without violation of the fix-free property. By Lemma 1, it is sufficient to prove that $|P(C) \otimes F(C)| \ge k_t$. Put $\delta = S(v_{t-1})$. Using (1), we get $\delta + k_t / 2^t \le 3/4$. Hence,

$$k_t \le 3 \cdot 2^{t-2} - \delta \cdot 2^t. \tag{3}$$

Now note that since $0 \in C(v_{t-1})$, it follows that ${}^{0}P(C) = F^{0}(C) = \emptyset$. Therefore, $P(C) = {}^{1}P(C)$ is right regular and $F(C) = F^{1}(C)$ is left regular. It can be easily checked that $|P(C)| = |F(C)| = 2^{t-1}(1 - \delta)$. The application of Lemma 2 yields

$$|P(C) \otimes F(C)| \ge 3 \cdot 2^{t-2} - \delta \cdot 2^t. \tag{4}$$

Combining (3) and (4), we obtain $|P(C) \otimes F(C)| \ge k_t$. This completes the proof of the first case of Theorem 1.
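The step-by-step construction of case 1 can be sketched in Python. This is a minimal illustration, assuming $k_1 = 1$ and a Kraft sum of at most $3/4$; the function name `build_case1` is hypothetical, and taking the first $k_t$ candidates in lexicographic order is an arbitrary choice made here (the counting argument above does not depend on which admissible words are picked).

```python
from fractions import Fraction
from itertools import product

def build_case1(v):
    """Build a fix-free code for v = (k_1, ..., k_n) with k_1 = 1, Kraft sum <= 3/4."""
    assert v[0] == 1
    assert sum(Fraction(k, 2 ** i) for i, k in enumerate(v, start=1)) <= Fraction(3, 4)
    code = ["0"]  # C(v_1) = {0}
    for t in range(2, len(v) + 1):
        # Words of length t that have no codeword as a prefix or a suffix.
        candidates = [w for w in ("".join(b) for b in product("01", repeat=t))
                      if not any(w.startswith(c) or w.endswith(c) for c in code)]
        # Inequalities (3) and (4) guarantee that enough candidates exist.
        assert len(candidates) >= v[t - 1]
        code.extend(candidates[:v[t - 1]])
    return code

print(build_case1((1, 0, 1, 2)))  # ['0', '101', '1001', '1111']
```

The vector $(1, 0, 1, 2)$ has Kraft sum exactly $3/4$, and at step 4 exactly two admissible words remain, so the bound is tight in this example.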

Proof of case 2: We shall now prove that $S(v_n) \le 3/4$, $k_1 = 0$ and $k_2 = 2$ imply the existence of a fix-free code $C(v_n)$. Again, our claim is stronger than the assertion of the theorem. Put $C(v_2) = \{00, 11\}$. Suppose that a fix-free code $C = C(v_{t-1})$ is constructed; we shall prove that $|P(C) \otimes F(C)| \ge k_t$. It is sufficient to prove that both inequalities (3) and (4) are fulfilled. The proof of inequality (3) is exactly the same as above, so we proceed to inequality (4).

Let us show that $P(C)$ is right regular. Assume the converse. Then there exists a vector $b \in \{0, 1\}^{t-2}$ such that both words $0b$ and $1b$ are prefix-free over $C(v_{t-1})$. Let us consider the two cases ${}^{(1)}b = 0$ and ${}^{(1)}b = 1$ separately. In the first case $0b$ is prefixed by the codeword 00. In the second case $1b$ is prefixed by the codeword 11. Thus, we have come to a contradiction. By the same argument, $F(C)$ is left regular.

As above, $|P(C)| = |F(C)| = 2^{t-1}(1 - \delta)$. The application of Lemma 2 yields (4). This completes the proof of the second case of Theorem 1.

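Proof of case 3: Since $k_1 = 0$ and $k_2 \le 1$, it follows that the vector $v_n$ can be uniquely represented as a sum of four vectors $v_n^1, v_n^2, v_n^3, v_n^4$ such that

$$v_n^i = (k_1^i, \ldots, k_n^i), \quad i = 1, 2, 3, 4,$$

$$S(v_n^1) = \tfrac{1}{4}, \quad S(v_n^2) = S(v_n^3) = S(v_n^4) = \tfrac{1}{8},$$

$$\text{if } k_t^i \ne 0, \text{ then } k_{t'}^{i'} = 0 \text{ for all } i' > i,\ t' < t.$$

Consider the following example of such a representation.

$$
\begin{aligned}
v_n &= (0, 0, 2, 1, 2, 6, 20), & S(v_n) &= 5/8, \\
v_n^1 &= (0, 0, 2, 0, 0, 0, 0), & S(v_n^1) &= 1/4, \\
v_n^2 &= (0, 0, 0, 1, 2, 0, 0), & S(v_n^2) &= 1/8, \\
v_n^3 &= (0, 0, 0, 0, 0, 6, 4), & S(v_n^3) &= 1/8, \\
v_n^4 &= (0, 0, 0, 0, 0, 0, 16), & S(v_n^4) &= 1/8.
\end{aligned}
$$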
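The representation can be computed greedily: codeword lengths are scanned in increasing order, and each of the four vectors is filled up to its Kraft sum before the next one is started. The Python sketch below is one way to do this, assuming $k_1 = 0$, $k_2 \le 1$ and a Kraft sum of at most $5/8$; the function name `split_vector` and the constant `CAPS` are hypothetical.

```python
from fractions import Fraction

# Target Kraft sums of the four vectors v^1, v^2, v^3, v^4.
CAPS = (Fraction(1, 4), Fraction(1, 8), Fraction(1, 8), Fraction(1, 8))

def split_vector(v):
    """Split v = (k_1, ..., k_n) into four vectors, filling them one after another."""
    parts = [[0] * len(v) for _ in CAPS]
    used = [Fraction(0)] * len(CAPS)
    i = 0  # index of the vector currently being filled
    for t, k in enumerate(v, start=1):
        while k > 0:
            if i == len(CAPS):
                raise ValueError("Kraft sum exceeds 5/8")
            # Number of words of length t that still fit into vector i.
            take = min(k, int((CAPS[i] - used[i]) * 2 ** t))
            if take == 0:
                raise ValueError("expected k_1 = 0 and k_2 <= 1")
            parts[i][t - 1] += take
            used[i] += Fraction(take, 2 ** t)
            k -= take
            if used[i] == CAPS[i]:
                i += 1
    return [tuple(p) for p in parts]

for part in split_vector((0, 0, 2, 1, 2, 6, 20)):
    print(part)
# (0, 0, 2, 0, 0, 0, 0)
# (0, 0, 0, 1, 2, 0, 0)
# (0, 0, 0, 0, 0, 6, 4)
# (0, 0, 0, 0, 0, 0, 16)
```

The output reproduces the four vectors of the example above.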

We shall construct a code $C(v_n)$ that is a union of four codes, $C(v_n) = C^{00}(v_n^1) \cup C^{01}(v_n^2) \cup C^{10}(v_n^3) \cup C^{11}(v_n^4)$, where each code $C^{\alpha\beta}(v_n^i)$ contains only codewords of the form $\alpha \star \beta$.

Thus, for each $t = 1, \ldots, n$ the set of codewords of length $t$ is composed of $k_t^1$ codewords of the form $0 \star 0$, $k_t^2$ codewords of the form $0 \star 1$, $k_t^3$ codewords of the form $1 \star 0$ and $k_t^4$ codewords of the form $1 \star 1$.

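We start with an empty code $C(v_1) = \emptyset$. Suppose that a fix-free code $C = C(v_{t-1})$ is constructed; we shall prove that on step $t$ the code can be extended with $k_t^1$ $0 \star 0$ codewords, $k_t^2$ $0 \star 1$ codewords, $k_t^3$ $1 \star 0$ codewords and $k_t^4$ $1 \star 1$ codewords of length $t$ without violation of the fix-free property. By Lemma 1, it is sufficient to prove that

$$
\begin{aligned}
|{}^{0}P(C) \otimes F^{0}(C)| &\ge k_t^1, \\
|{}^{0}P(C) \otimes F^{1}(C)| &\ge k_t^2, \\
|{}^{1}P(C) \otimes F^{0}(C)| &\ge k_t^3, \\
|{}^{1}P(C) \otimes F^{1}(C)| &\ge k_t^4.
\end{aligned}
$$

Put $\delta_i = S(v_{t-1}^i)$. Note that, by construction, each of the conditions $\delta_i = 0$ and $\delta_i < S(v_n^i)$ implies $\delta_{i+1} = 0$. We shall consider four possible cases:

1) $\delta_1 < 1/4$, $\delta_2 = \delta_3 = \delta_4 = 0$

2) $\delta_1 = 1/4$, $\delta_2 < 1/8$, $\delta_3 = \delta_4 = 0$

3) $\delta_1 = 1/4$, $\delta_2 = 1/8$, $\delta_3 < 1/8$, $\delta_4 = 0$

4) $\delta_1 = 1/4$, $\delta_2 = 1/8$, $\delta_3 = 1/8$, $\delta_4 < 1/8$

In all the cases we use the fact that

$$k_t^i \le 2^t \left( S(v_n^i) - \delta_i \right). \tag{7}$$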

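Case 3.1: $\delta_1 < 1/4$, $\delta_2 = \delta_3 = \delta_4 = 0$. Using (7), we get

$$k_t^1 \le 2^{t-2} - \delta_1 \cdot 2^t, \quad k_t^2 \le 2^{t-3}, \quad k_t^3 \le 2^{t-3}, \quad k_t^4 \le 2^{t-3}.$$

It can be easily checked that

$$
\begin{aligned}
|{}^{0}P(C)| &= |F^{0}(C)| = 2^{t-2} - \delta_1 \cdot 2^{t-1}, \\
|{}^{1}P(C)| &= |F^{1}(C)| = 2^{t-2}.
\end{aligned}
$$

The application of Lemma 2 yields

$$
\begin{aligned}
|{}^{0}P(C) \otimes F^{0}(C)| &\ge 2^{t-2} - \delta_1 \cdot 2^{t} \ge k_t^1, \\
|{}^{0}P(C) \otimes F^{1}(C)| &\ge 2^{t-2} - \delta_1 \cdot 2^{t-1} \ge 2^{t-3} \ge k_t^2, \\
|{}^{1}P(C) \otimes F^{0}(C)| &\ge 2^{t-2} - \delta_1 \cdot 2^{t-1} \ge 2^{t-3} \ge k_t^3, \\
|{}^{1}P(C) \otimes F^{1}(C)| &\ge 2^{t-2} \ge 2^{t-3} \ge k_t^4.
\end{aligned}
$$

This completes the proof of case 3.1.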

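Case 3.2: $\delta_1 = 1/4$, $\delta_2 < 1/8$, $\delta_3 = \delta_4 = 0$. By the same argument as above,

$$k_t^1 = 0, \quad k_t^2 \le 2^{t-3} - \delta_2 \cdot 2^t, \quad k_t^3 \le 2^{t-3}, \quad k_t^4 \le 2^{t-3}.$$

We see that

$$
\begin{aligned}
|{}^{0}P(C)| &= 2^{t-2} - \left(\tfrac{1}{4} + \delta_2\right) \cdot 2^{t-1}, \\
|{}^{1}P(C)| &= 2^{t-2}, \\
|F^{0}(C)| &= 2^{t-3}, \\
|F^{1}(C)| &= 2^{t-2} - \delta_2 \cdot 2^{t-1}.
\end{aligned}
$$

By Lemma 2, we have

$$
\begin{aligned}
|{}^{0}P(C) \otimes F^{1}(C)| &\ge 2^{t-3} - \delta_2 \cdot 2^{t} \ge k_t^2, \\
|{}^{1}P(C) \otimes F^{0}(C)| &\ge 2^{t-3} \ge k_t^3, \\
|{}^{1}P(C) \otimes F^{1}(C)| &\ge 2^{t-2} - \delta_2 \cdot 2^{t-1} \ge 2^{t-3} \ge k_t^4.
\end{aligned}
$$

This completes the proof of case 3.2.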

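Case 3.3: $\delta_1 = 1/4$, $\delta_2 = 1/8$, $\delta_3 < 1/8$, $\delta_4 = 0$. As above,

$$k_t^1 = 0, \quad k_t^2 = 0, \quad k_t^3 \le 2^{t-3} - \delta_3 \cdot 2^t, \quad k_t^4 \le 2^{t-3}.$$

It is easily shown that

$$
\begin{aligned}
|F^{0}(C)| &= 2^{t-3} - \delta_3 \cdot 2^{t-1}, \\
|{}^{1}P(C)| &= 2^{t-2} - \delta_3 \cdot 2^{t-1}, \\
|F^{1}(C)| &= 3 \cdot 2^{t-4}.
\end{aligned}
$$

Applying Lemma 2, we obtain

$$
\begin{aligned}
|{}^{1}P(C) \otimes F^{0}(C)| &\ge 2^{t-3} - \delta_3 \cdot 2^{t} \ge k_t^3, \\
|{}^{1}P(C) \otimes F^{1}(C)| &\ge 3 \cdot 2^{t-4} - \delta_3 \cdot 2^{t-1} \ge 2^{t-3} \ge k_t^4.
\end{aligned}
$$

This completes the proof of case 3.3.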

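Case 3.4: $\delta_1 = 1/4$, $\delta_2 = 1/8$, $\delta_3 = 1/8$, $\delta_4 < 1/8$. As above,

$$k_t^1 = 0, \quad k_t^2 = 0, \quad k_t^3 = 0, \quad k_t^4 \le 2^{t-3} - \delta_4 \cdot 2^t.$$

One can easily see that

$$|{}^{1}P(C)| = |F^{1}(C)| = 2^{t-2} - \left(\tfrac{1}{8} + \delta_4\right) \cdot 2^{t-1}.$$

By Lemma 2, we have

$$|{}^{1}P(C) \otimes F^{1}(C)| \ge 2^{t-3} - \delta_4 \cdot 2^{t} \ge k_t^4.$$

This completes the proof of the theorem.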
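The construction of case 3 can be traced on the example vector $(0, 0, 2, 1, 2, 6, 20)$. The Python sketch below takes the four component vectors of that example as given and, at every step $t$, picks the required number of admissible words of each form $\alpha \star \beta$. It is an illustration only: the names `PARTS`, `FORMS`, `build_case3` and `is_fix_free` are hypothetical, and choosing the lexicographically first admissible words is an arbitrary choice made for this sketch.

```python
from itertools import product

# Component vectors of the example, paired with the form of their codewords.
PARTS = [
    (0, 0, 2, 0, 0, 0, 0),   # v^1: words of the form 0*0
    (0, 0, 0, 1, 2, 0, 0),   # v^2: words of the form 0*1
    (0, 0, 0, 0, 0, 6, 4),   # v^3: words of the form 1*0
    (0, 0, 0, 0, 0, 0, 16),  # v^4: words of the form 1*1
]
FORMS = [("0", "0"), ("0", "1"), ("1", "0"), ("1", "1")]

def is_fix_free(code):
    """True if no codeword is a prefix or a suffix of another codeword."""
    return all(not (b.startswith(a) or b.endswith(a))
               for a in code for b in code if a != b)

def build_case3(parts):
    """Add, for every length t, parts[i][t-1] admissible words of the i-th form."""
    n = len(parts[0])
    code = []
    for t in range(1, n + 1):
        # Words of length t that keep the current code fix-free (Lemma 1).
        free = [w for w in ("".join(b) for b in product("01", repeat=t))
                if not any(w.startswith(c) or w.endswith(c) for c in code)]
        new_words = []
        for part, (first, last) in zip(parts, FORMS):
            pool = [w for w in free if w[0] == first and w[-1] == last]
            if len(pool) < part[t - 1]:
                raise RuntimeError("not enough words of the required form")
            new_words.extend(pool[:part[t - 1]])
        code.extend(new_words)  # words of equal length never conflict
    return code

code = build_case3(PARTS)
print(len(code))          # 31
print(is_fix_free(code))  # True
```

The resulting code has 31 codewords, the sum of the entries of the example vector, with the prescribed number of codewords of each length.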



III. Upper Bound for the Redundancy

Theorem 2: For each probability distribution $p = \{p_1, \ldots, p_m\}$ there exists a binary fix-free code $C$ where the average length of the codewords $L(C)$ satisfies

$$L(C) < H(p) + 4 - \log_2 5.$$

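Proof: By $l_1, \ldots, l_m$ denote the codeword lengths. We define

$$l_i = \lceil -\log_2 p_i + 3 - \log_2 5 \rceil.$$

It follows that

$$\sum_{i=1}^{m} 2^{-l_i} \le \sum_{i=1}^{m} 2^{\log_2 p_i - 3 + \log_2 5} = \frac{5}{8} \sum_{i=1}^{m} p_i = \frac{5}{8}.$$

By Theorem 1 there exists a fix-free code $C$ with the codeword lengths $l_1, \ldots, l_m$. The average length of this code is

$$L(C) = \sum_{i=1}^{m} p_i l_i < \sum_{i=1}^{m} p_i \left( -\log_2 p_i + 4 - \log_2 5 \right) = H(p) + (4 - \log_2 5) \sum_{i=1}^{m} p_i = H(p) + 4 - \log_2 5.$$

This completes the proof.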
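The choice of lengths in the proof can be illustrated numerically. The Python sketch below computes the lengths $l_i$ for a sample distribution and checks both the Kraft sum condition of Theorem 1 and the redundancy bound of Theorem 2. The function name `fix_free_lengths` and the sample distribution are assumptions made for this example, and floating-point logarithms are used, so a value of $-\log_2 p_i + 3 - \log_2 5$ that is extremely close to an integer could be rounded the wrong way.

```python
import math
from fractions import Fraction

def fix_free_lengths(p):
    """Codeword lengths l_i = ceil(-log2(p_i) + 3 - log2(5))."""
    shift = 3 - math.log2(5)
    return [math.ceil(-math.log2(pi) + shift) for pi in p]

p = [0.5, 0.25, 0.125, 0.125]  # sample distribution, chosen for illustration
lengths = fix_free_lengths(p)
kraft = sum(Fraction(1, 2 ** l) for l in lengths)
entropy = -sum(pi * math.log2(pi) for pi in p)
average = sum(pi * l for pi, l in zip(p, lengths))

print(lengths)                                # [2, 3, 4, 4]
print(kraft, kraft <= Fraction(5, 8))         # 1/2 True
print(average < entropy + 4 - math.log2(5))   # True
```

For this distribution the entropy is 1.75 bits and the average length is 2.75, so the redundancy is 1, below the bound of approximately 1.678.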